(54) Title: IMAGE COMPARING SYSTEM
(57) Abstract 

A method and system for quickly comparing a target image (110) with candidate 
images in a database, and for extracting those images in the database that best match the 
target image. The method uses a fundamental comparison technique (170) based on the 
decomposition of the images into "blobs". A given image is modified so as to reduce 
detail. Cohesive regions of the reduced-detail image are transformed into uniform-color 
blobs. Statistics are generated (15) for each such blob, characterizing, for example, its area, 
color, location and shape, and also, optionally, measures of the texture of the corresponding 
area in the original image. An image-similarity score is computed for any pair of images 
from the blob-specific image statistics. The image-similarity measure is computed by 
placing the blobs of the target image in one-to-one correspondence (510) with blobs of 
the candidate image, generating blob-similarity scores (520) over these paired blobs from 
the pre-computed blob-specific statistics of the images, and generating an overall image 
similarity score (600) as a function of the blob-similarity scores.

IMAGE COMPARING SYSTEM



BACKGROUND OF THE INVENTION 
The present invention relates generally to image processing techniques for 
comparing images, and in particular to a method and system for extracting from an image 
database a set of images that are close in appearance to a target image. 

The general problem to be solved is that of retrieving from a large and diverse 
database of images all of the images that share certain properties. Attempts have been made 
to solve this problem by assigning to each image a set of keywords at the time it is inserted 
into the database. Images are then judged to be similar if they are tagged with the same 
keywords. The problem with this method is that it is impossible to encapsulate in a few 
words everything about the image that might be used as a basis for judging image similarity.
For example, a picture of a car on a beach may be tagged with the keywords "car" and
"beach", but probably will not be tagged with such terms as "brown pebbly beach" or "beach
next to lake with blue green water" or "beach on the left; lake on the right". People see a lot 
of things they do not commonly put into words. However, actual image comparison is often 
based on just these non-verbal attributes of an image, e.g., on what the image is like instead
of how the image would be described in words.

The advent of databases of digital images on computers makes it possible to
compare images on the basis of their actual visual attributes (colors, textures, shapes, etc.). 
This permits image search by example; the operator of a computer image search system
selects a given target image and then requests the computer system to find all images in the
database which resemble the example.

It is, however, difficult to design a successful search-by-example system. The 
problem is that a human being in deciding whether or not two images are similar processes 
the data in an image in a complex manner. Color, shape, texture, etc. are interdependent in 
the effect they exert on a person's judgment of image similarity. 

Existing prior-art systems, however, have placed too much emphasis on single 
sets of features of the images being compared. A particular prior-art technique consists of 
generating a frequency diagram or histogram of all the colors in the image. Two images are 
judged to be similar if their color histograms are similar. Such techniques ignore the shapes 
of objects in the scene, and hence do a poor job of imitating a human's methodology for 
comparison of images. Other techniques look for image shapes of a specific type, for 
example, human faces or thumbprints. These methods do analyze objects in the image, but 
they are limited to the specific task of the identification of specific target objects. 

Many prior-art computer techniques also require extensive analysis of 
candidate images at search time and hence are slow. 

Accordingly, it is clear that what is needed in the art is an improved 
methodology for extracting images from a database that is both quicker and more like real
human image-matching methodology than is the prior art.

SUMMARY OF THE INVENTION

The present invention provides a method and system for quickly comparing a 
target image with candidate images in a database, and for extracting those images in the 
database that best match the target. 

The invention associates with each image a set of image statistics 
characterizing the image. Hence the invention is similar to prior-art keyword-tagging search 
schemes in the sense that a set of characteristics is assigned to each image. In the present 
invention, however, the selection of this set of characteristics is based on algorithmic 
examination and decomposition of the image by a computer program and is not subject to 
human idiosyncrasies or errors. When a target image is selected or inputted, the same 
decomposition is done to it. 

The process that associates with each image a set of image statistics makes 
use of decomposition of the image into a set of "blobs". Each blob is a cohesive area of the 
original image (roughly uniform in color, or bounded by a distinct edge) which can be
transformed into an exactly uniform-in-color region in the decomposed image. Each cohesive
region or blob is characterized by a limited set of numerical parameters (e.g. x and y extent, 
center of gravity, color, shape, texture, etc.). The set of blobs in the image, along with the 
characterizing statistics of each blob, constitute the characterizing statistics for the image. 

An image-similarity score is calculated for any pair of images based on a 
comparison of the image statistics of the two images. The computation of an image-similarity
score between two images typically comprises the three steps of (a) placing the
blobs of the two images in one-to-one correspondence, (b) computing a similarity score for 
each pair of blobs, and then (c) obtaining an overall similarity score for the two images as a 
function of the similarity scores of the paired blobs in the two images. The user is able to 
modify aspects of the image-comparison algorithm by varying the weights assigned to the 
parameters (e.g., size, color, position, etc.) used in generating an image-similarity score. 

A computer-implemented method is provided for selecting from a computer 
database of candidate images one or more images which closely match a target image. The 
method typically includes the steps of extending the image database by computing, for each 
candidate image, image-characterizing statistics and adding the statistics to the database;
computing image-characterizing statistics for the target image; computing, for each candidate
image, a measure of its similarity to the target image, wherein the measure is computed as a
function of the image-characterizing statistics of the target image and of the candidate image;
and displaying at least a portion of one or more of the candidate images having the best
image-similarity measures.

An image processing system is also provided. The image processing system 
typically includes a memory for storing a plurality of candidate images and image- 
characterizing statistics associated with each candidate image; and input means for inputting 
a target image for comparison with the candidate images. The system also typically includes
a microprocessor coupled to the memory and the input means, wherein the microprocessor
computes image-characterizing statistics for the target image, and wherein for each candidate 
image the microprocessor determines a measure of the similarity of the candidate image to 
the target image, wherein the similarity measure is computed as a function of the
image-characterizing statistics of the target image and the image-characterizing statistics of the
candidate image; and a display for displaying at least a portion of one or more of the
candidate images having the best image-similarity measures. 

Reference to the remaining portions of the specification, including the 
drawings and claims, will realize other features and advantages of the present invention. 
Further features and advantages of the present invention, as well as the structure and
operation of various embodiments of the present invention, are described in detail below with
respect to the accompanying drawings. In the drawings, like reference numbers indicate
identical or functionally similar elements.




BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 illustrates an exemplary image processing system for extracting 
images from a database "by example" according to an embodiment of the present invention; 

Figure 2 is a flowchart showing the process of analyzing images to be stored
in the database;

Figure 3 is a flowchart showing the process of obtaining a target image,
generating statistics for it, comparing it with images stored in the database and displaying the 
result; 

Figure 4 illustrates a lion cub image and an owl image and accompanying 
statistics after reduction of the images to blobs; 

Figure 5 shows the computer display a user might see after seeking a set of 
twenty candidate images matching the lion cub image; 

Figure 6 illustrates the image match controls in an embodiment of the
invention;

Figure 7 is a flowchart showing the process of comparing the target image 
with images stored in the database; and 

Figure 8 is a flowchart showing the process of generating match scores for
blob pairs.



DESCRIPTION OF THE SPECIFIC EMBODIMENTS 
Figure 1 illustrates an embodiment of an image processing system for 
implementing the image processing and comparison techniques of the present invention. 
Image processing system 70 includes a computer system 71 comprising a microprocessor 72 
and a memory 74. Microprocessor 72 performs the image processing and memory 74 stores 
computer code for processing images. Computer system 71 is any type of computer, such as 
a PC, a Macintosh, laptop, mainframe or the like. Imaging system 70 also includes a scanner 
80 for scanning images directly. Computer system 71 is coupled to monitor 76 for displaying 
a graphical user interface as well as images. Computer system 71 is also coupled to various 
interface devices such as internal or external memory drives, a mouse and a keyboard (not
shown). Printer 78 allows for the printing of any images as required by the user. Cable 82
provides the ability to transfer images to and from another computer device via e-mail, the
Internet, direct access or the like.

Figure 2 is a flowchart showing the process of analyzing images to store their
characteristic data according to an embodiment of the present invention. The process breaks 
down into the following general steps as shown in Figure 2: 

a. Image insertion; 

b. create "blobs"; 

c. analyze "blobs"; and 

d. store results. 

In step 10, an image is provided to the imaging system. In one embodiment, 
the image is provided by selecting an image from an existing collection of images stored in a 
memory. Alternatively, an image could be provided to the imaging system using a color
scanner, digital camera, paint program, or the like. 

Once an image is provided, the sequence of operations outlined by box 15 of 
Figure 2 will result in the generation of a set of statistics characterizing the image. This 
sequence of operations is decomposed into the specific steps described below. 

At step 20, the image is resized to a standard size while maintaining the aspect 
ratio. The scale factor is stored for later comparisons based on size. In one embodiment, the 
image is reduced to a maximum 64-by-64 pixel resolution. Other resolutions may be used as 
desired. There is a tradeoff between the speed and the accuracy of the image comparison 
process. Smaller resolutions provide for increased speed with some loss of accuracy. 
Maintenance of the aspect ratio means that if the original image is non-square, then the 
longer axis of the reduced image will have the designated size (e.g., 64 pixels) and the 
shorter axis will be proportionately smaller. 
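
The resizing rule of step 20 can be sketched as follows. This is a minimal illustration only; the function name `reduced_size`, the rounding to the nearest pixel, and the returned tuple layout are assumptions rather than details taken from the specification.

```python
def reduced_size(width, height, max_side=64):
    """Return (new_width, new_height, scale) for an aspect-preserving resize.

    The longer axis is mapped to max_side pixels and the shorter axis is
    scaled proportionately. The scale factor is returned so that it can be
    stored for later size-based comparisons.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    longer = max(width, height)
    scale = max_side / longer
    # Rounding to the nearest pixel (minimum 1) is an assumption of this sketch.
    new_width = max(1, round(width * max_side / longer))
    new_height = max(1, round(height * max_side / longer))
    return new_width, new_height, scale
```

For instance, a 640-by-480 image would map to 64 by 48 pixels under these assumptions.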

In step 30, detail is removed from the reduced-size image. In one 
embodiment, the image is blurred using a 10-pixel radius Gaussian blur filter. This 
effectively removes most of the detail of the image while keeping the dominant colors mostly 
intact. Alternatively or additionally, a median filter may also be used to blur the image. In
another embodiment, an edge-preserving blur is used to reduce detail. One embodiment uses
a Sigma filter as an edge-preserving blur; each pixel is replaced by a mean value of all pixels
(a) which are within a given distance of the target pixel, and (b) whose color differs from the
color of the target pixel by less than a specified amount. 
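
A Sigma filter of the kind just described might be sketched as below. For brevity the sketch assumes a single-channel (grayscale) image stored as a list of rows and a square neighborhood; the parameter names `radius` and `threshold` are illustrative and do not come from the specification.

```python
def sigma_filter(image, radius=2, threshold=20):
    """Edge-preserving blur: average only those nearby pixels whose value
    differs from the center pixel by less than threshold (threshold > 0)."""
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            total, count = 0.0, 0
            # Condition (a): pixels within a given distance (square window here).
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    # Condition (b): color close enough to the target pixel.
                    if abs(image[ny][nx] - center) < threshold:
                        total += image[ny][nx]
                        count += 1
            # count is at least 1 because the center pixel always qualifies.
            result[y][x] = total / count
    return result
```

Because pixels across a strong edge fail condition (b), they are left out of the average, so the edge survives while small variations are smoothed.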

In step 40, the blurred, reduced-size image is decomposed into a set of
cohesive regions or "blobs". In one embodiment, blobs of identical color are generated by
reducing the number of different colors in the image to a small number. This is done
according to one embodiment by using resampling techniques developed originally for
computer video displays. Early computer video displays had a small palette of distinct
displayable colors. In some such displays the number of displayable colors was limited to a
value such as 64 or 256, but each color in this limited palette could be chosen at run time
from millions of candidate colors. Hence, technologies were developed to reduce the set of
perhaps millions of distinct colors in an image to a representative set of, say, 256 of these
colors. One embodiment of this invention uses one such image resampling technique, a
median-cut algorithm, as is described, for example, by James D. Foley, van Dam, Feiner and
Hughes, in Computer Graphics, Principles and Practice, Addison-Wesley, 1995, at p. 600,
the disclosure of which is hereby incorporated by reference. Although more colors can be
used, in preferred aspects the number of colors should be less than about 10 to speed the
subsequent image-match algorithm. Note that these colors can be completely different across
images. The image is now divided into a set of areas of solid color, i.e., blobs. These blobs
are catalogued using a flood-fill type algorithm, as is well-known in the art and which is
described in Foley, van Dam, Feiner and Hughes, op. cit., pp. 979-980.
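
The cataloguing of solid-color areas by flood fill can be illustrated with the following sketch. It assumes the image has already been reduced to a small palette, uses 4-connectivity, and represents the image as a list of rows of hashable color values; these choices and the name `label_blobs` are assumptions made for illustration.

```python
from collections import deque

def label_blobs(image):
    """Assign a blob index to every pixel; pixels share an index when they
    are 4-connected and have identical color. Returns (labels, blob_count)."""
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]
    blob_count = 0
    for start_y in range(height):
        for start_x in range(width):
            if labels[start_y][start_x] != -1:
                continue  # already part of an earlier blob
            color = image[start_y][start_x]
            labels[start_y][start_x] = blob_count
            queue = deque([(start_x, start_y)])
            while queue:  # breadth-first flood fill from the seed pixel
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and labels[ny][nx] == -1
                            and image[ny][nx] == color):
                        labels[ny][nx] = blob_count
                        queue.append((nx, ny))
            blob_count += 1
    return labels, blob_count
```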

An alternative embodiment for the blob-generation process of step 40 employs 
an adaptive color seed-fill algorithm, thus eliminating the need for image resampling. In this 
embodiment, the image is scanned pixel by pixel, left to right, top to bottom. The first pixel 
in the image, at the top left, is taken to be the first pixel of the first blob. The second pixel
scanned is added to the first blob if it is sufficiently similar in color to the first pixel.
Otherwise, it becomes the first pixel of a second blob. A pixel scanned subsequently is
added to the blob enclosing one of its adjacent already-scanned neighbor pixels if its color is
sufficiently similar to the color of the adjacent pixel. Otherwise it becomes the first pixel in a
new blob. This algorithm is a variant of a seed-fill algorithm as is well-known in the art and
as is described in Foley, van Dam, Feiner and Hughes, op. cit., pp. 979-980. This algorithm
varies from standard seed-fill algorithms in its adaptive property. Standard seed-fill
algorithms cease adding pixels to an area when a pixel is encountered that fails a fixed test;
e.g., the test might be that the pixel have a color not too different from black. The seed-fill
algorithm used in this embodiment is adaptive in the sense that the test for inclusion of a
pixel into the blob enclosing a neighbor pixel depends on the color of the neighbor pixel.
Hence, because the colors of the pixels within a blob vary at this stage, the test for inclusion
or exclusion of pixels adapts itself depending on the color of the target pixel. A result is that
an original-image area of gradually-changing color (e.g., a vignette, gradient, or ramp) may
be parsed by the blob-generating algorithm as a single blob.

At the end of the blob generation step 40 the entire reduced-size image has
been partitioned into a set of blobs; every pixel in the reduced-size image has been assigned
to a blob.
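
The adaptive seed-fill scan can be sketched in a few lines. The sketch makes several simplifying assumptions that are not stated in the specification: the image is grayscale, only the left and upper neighbors count as "already scanned", the left neighbor is tried first, and blobs that later turn out to touch are not merged. The name `adaptive_seed_fill` and the `tolerance` parameter are hypothetical.

```python
def adaptive_seed_fill(image, tolerance=10):
    """Scan left to right, top to bottom. A pixel joins the blob of an
    already-scanned neighbor when its value is within tolerance of that
    neighbor's value; otherwise it starts a new blob.
    Returns (labels, blob_count)."""
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]
    blob_count = 0
    for y in range(height):
        for x in range(width):
            for nx, ny in ((x - 1, y), (x, y - 1)):  # left, then upper neighbor
                if nx >= 0 and ny >= 0 and abs(image[y][x] - image[ny][nx]) <= tolerance:
                    # The test is relative to the neighbor's own color,
                    # which is what makes the fill adaptive.
                    labels[y][x] = labels[ny][nx]
                    break
            else:
                labels[y][x] = blob_count  # no similar neighbor: new blob
                blob_count += 1
    return labels, blob_count
```

On a ramp such as `[0, 8, 16, 24]` with a tolerance of 10, each pixel is close to its predecessor, so the whole ramp is parsed as one blob even though its endpoints differ by 24.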

When the blob-generation step 40 is completed, step 45 is entered. Step 45 is
used to ascertain whether a pre-specified set of criteria concerning the total number of blobs
in the reduced-size image has been achieved. If so, flow of control passes to the
blob-analysis step 50. If not, steps 30 and 40 are repeated, but with the parameters of the
detail-removal and blob-generation algorithms modified so that on the subsequent pass through
steps 30 and 40 the image will be decomposed into a smaller total number of blobs. For
example, if the adaptive color seed-fill algorithm is used to generate blobs, then on each
iteration through step 40 it may be programmed to be more liberal and less discriminating in
the criteria it applies when deciding whether or not to add a given image pixel to an existing
blob. The system is programmed to cycle through steps 30, 40 and 45 until the
predetermined goal has been reached, or until a predetermined maximum number of cycles
have been taken. Control then passes to step 50.

On each iteration through steps 30 and 40, the total number of blobs in the
decomposed image declines. (Strictly speaking, the number either declines or stays the
same.) The goal of iterating over steps 30 and 40 is, roughly speaking, to reduce the number
of blobs to a predetermined maximum number. In one embodiment, the number of blobs is
preferably reduced to ten blobs. However, any number of blobs can be used. Additionally,
in many embodiments it is efficient to impose a halting criterion that does not refer explicitly
to the target number of blobs into which the reduced-size image has been decomposed. For
example, one such halting criterion is that the largest p blobs (for example, the largest 10
blobs) occupy an area equal to a pre-specified proportion (e.g., 75%) of the reduced-size
image.
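
The cycle through steps 30, 40 and 45 with an area-based halting criterion might look like the sketch below. Here `decompose` is a hypothetical callable standing in for steps 30 and 40: it takes the cycle index (a larger index meaning more liberal parameters) and returns the list of blob areas. The sketch also reads "equal to a pre-specified proportion" as "at least that proportion", which is an interpretation.

```python
def largest_blobs_cover(blob_areas, p=10, proportion=0.75):
    """Step 45 test: do the p largest blobs cover at least the given
    proportion of the total image area?"""
    total = sum(blob_areas)
    covered = sum(sorted(blob_areas, reverse=True)[:p])
    return total > 0 and covered >= proportion * total

def decompose_until_done(decompose, max_cycles=5, p=10, proportion=0.75):
    """Repeat the (hypothetical) decompose callable until the halting
    criterion holds or max_cycles passes have been made.
    Returns (blob_areas, cycles_used)."""
    if max_cycles < 1:
        raise ValueError("max_cycles must be at least 1")
    for cycle in range(max_cycles):
        areas = decompose(cycle)  # stands in for steps 30 and 40
        if largest_blobs_cover(areas, p, proportion):
            break
    return areas, cycle + 1
```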

If the minimal perimeter adaptive fill seed algorithm has been used for blob 
generation there is no guarantee that each individual blob remaining in the image at the 
beginning of step 50 will be filled with pixels of identical color. However, it is required that 
at the beginning of step 50 a mean color for each blob be known. Hence, in this case, there 
may be an additional step between steps 45 and 50 comprising replacing the colors of pixels 
within each blob with the average of the colors of all pixels in the blob. Alternatively, the 
average color of each blob may be computed as the blob is constructed, so that the mean 
color of the blob is known before step 50 is entered. 
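
Computing the mean color of each blob from a label map could be done as in this sketch, which assumes RGB tuples and a `labels` grid of blob indices with the same shape as the image; both representations are assumptions made for illustration.

```python
def mean_blob_colors(image, labels):
    """Return {blob_index: (mean_r, mean_g, mean_b)} for an RGB image and a
    same-shaped grid of blob indices."""
    sums = {}
    for row_pixels, row_labels in zip(image, labels):
        for (r, g, b), label in zip(row_pixels, row_labels):
            acc = sums.setdefault(label, [0, 0, 0, 0])  # r, g, b sums and count
            acc[0] += r
            acc[1] += g
            acc[2] += b
            acc[3] += 1
    return {label: (sr / n, sg / n, sb / n)
            for label, (sr, sg, sb, n) in sums.items()}
```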

Figure 4 shows two images after they have been reduced to 64-by-64 pixel 
resolution, ten-blob images. Image 200 of Figure 4 is the blob image of the original lion cub 
image 300 shown in Figure 5 (and of thumbnail image 305 of Figure 5). Image 210 of 
Figure 4 is the blob image of the great horned owl thumbnail image 310 shown in Figure 5.

In step 50, the characteristics (e.g., color, size, center of gravity, moment of 
inertia, texture, etc.) of the blobs are determined, and a numerical view of the blobs is 
created. For efficiency, in one embodiment, step 50 is combined with step 40; i.e., the image 
statistics are in fact generated as the blobs are being generated. 

The numerical view of each image created in step 50 is stored (usually, but 
not necessarily, in a database) in step 60. 

Figure 4 shows statistics for the four largest (amongst ten generated) of the 
blobs in the lion cub image 200 and the owl image 210, and, in addition, other statistics 
generated after matching the two images. Column 0, headed "Match" enumerates the matches 
between the largest four blobs of the image, in order, with the best match shown first. 
Column 1, headed "Blob" shows which blobs are matched in each Match. The first two 
entries in the "Blob" column as shown are zero and zero, indicating that the match is 
between blob 0 of image 0, background area 202 of lion cub image 200, and blob 0 of image 
1, background area 212 of owl image 210. The next column headed "ValA" shows an overall
match score for the two blobs. The next column headed "Val" shows a normalized match
score, ValA divided by an Area measure, for the two blobs. The next column headed "Area"
shows the areas in pixels of the two blobs. Subsequent columns show the statistics
summarized below (in each case the statistic characterizes a blob):

X: the X position of the center of gravity;

Y: the Y position of the center of gravity; 

H: the hue (a color measure); 

S: the saturation (a color measure); 

V: the value (a color measure); 
Xe: the X extent, in pixels;

Ye: the Y extent, in pixels; 

Mo: the moment of inertia; 

Ra: the minimum radius; 

An: the angle from the horizontal of the major axis; 
Sk: the skewness.

The image statistics illustrated in Figure 4 exemplify one embodiment. Other embodiments
will vary. The shared goal in the various embodiments is to include statistics measuring for
each blob its size (Area in the example), location (X and Y in the example), color (H, S and
V in the example), and shape (Area, Xe, Ye, Mo, Ra, An and Sk in the example). Other
embodiments add to this list a set of measures of the textures of blobs.
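
Several of the listed statistics can be computed directly from a blob's pixel coordinates and mean color, as the sketch below suggests. The specification does not give formulas, so the definitions used here are assumptions: the moment of inertia is taken as the sum of squared distances from the center of gravity, extents are inclusive pixel spans, and HSV is derived with Python's standard `colorsys` module. Ra, An and Sk are omitted because their exact definitions are not stated.

```python
import colorsys

def blob_statistics(pixels, mean_rgb):
    """pixels: list of (x, y) coordinates in the blob.
    mean_rgb: mean blob color as (r, g, b), each in 0..255."""
    area = len(pixels)
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    cx = sum(xs) / area  # center of gravity
    cy = sum(ys) / area
    h, s, v = colorsys.rgb_to_hsv(*(c / 255 for c in mean_rgb))
    return {
        "Area": area,
        "X": cx,
        "Y": cy,
        "H": h,
        "S": s,
        "V": v,
        "Xe": max(xs) - min(xs) + 1,  # inclusive extent (assumed definition)
        "Ye": max(ys) - min(ys) + 1,
        # Assumed definition: second moment about the center of gravity.
        "Mo": sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in pixels),
    }
```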

The above process of image statistics generation, as shown in box 15 of
Figure 2, is repeated for each image desired to be stored. 

After all information has been created a user inputs a target image desired to 
be matched with the collection of stored images. The target image is analyzed as above. 
Figure 3 is a flowchart showing the process of analyzing and comparing the target image
with a collection of stored images. The matching process breaks down into the following
general steps as shown in Figure 3:

a. Generate image statistics;

b. obtain user requirements (e.g., color important, position important, etc.);

c. compare to stored images; and

d. display results of best/closest match(es). 

According to one embodiment, in step 110 of Figure 3 the target image is 
provided by selecting an image from a pre-existing collection of images. Alternatively, the 
target image is provided to the imaging system using a color scanner, digital camera, paint 
program, or the like. 

Once the target image is provided, the target image is subjected in step 115 of 
Figure 3 to the same sequence of image statistics generation operations as were applied to 
database images in step 15 of Figure 2. 

At step 150 of Figure 3, the numerical results of the statistic-generating step 
115 are cached in computer memory for later comparison to the same statistics generated for
images in the database, which statistics were stored in the database at step 60 of Figure 2. 

In step 160, the specific requirements of the image processing system operator 
are obtained. The user has control in determining which search parameters are most 
important (e.g., whether color and/or location are the most important parameters when 
searching for matches). A set of sliders, such as are shown in Figure 6, is presented to the 
user to permit setting of the importance of various factors to be used in the comparison of the 
target image with candidate images. These factors include, for example: 

1. The maximum number of candidate image matches to display (e.g., the 
"Max Ids to Return" slider 400 in Figure 6). 

2. The maximum number of blobs to compare (e.g., the "Max Blobs to
Compare" slider 405 in Figure 6). 

3. A measure of the importance of color in the match (e.g., the "Color 
Weight" slider 420 in Figure 6). 

4. A measure of the importance of position in the match (e.g., the "Location
Weight" slider 415 in Figure 6), affecting how the center of gravity parameter is 
weighted in the matching computation. 

5. Measures of the importance of shape in the match (e.g., the "Area",
"Extents", "Inertia", "Radius", "Angle", and "Skew" sliders 410, 425, 430, 435, 440
and 445, respectively, of Figure 6). These affect how the moment of inertia, the x and
y extents, etc., are used in the match.

In step 170, once the statistics for the target image have been determined, the
given target image is compared with all stored candidate images and a match score is
generated for each pair of the form (target image, candidate image).

Figure 7 is a flow chart displaying the details of the "Compare with Images in
Database" step 170 of Figure 3. An image match score for each pair of images is generated
from the similarity scores of the paired blobs from the two images. Consequently, before an
image match score is generated it is necessary to place all or some of the blobs from the two
images into one-to-one correspondence. The correspondence is such that similar blobs from
each image are paired with each other. In one preferred embodiment 10 blobs are generated
for each image, and four blobs from each image are placed in one-to-one correspondence
with each other. The general rule is that if p blobs are generated for each image, then n blobs
from each image, n < p, are placed in one-to-one correspondence with each other. The
former number p is the number of generated blobs in the image and the latter number n is the
number of significant blobs in the image. The process of placing the significant blobs in
one-to-one correspondence is shown as step 510 of Figure 7.

In step 510, the n significant blobs are placed in one-to-one correspondence.
This requires as input a set of measures of the similarity of blob pairs. These measures are
generated at step 500 of Figure 7, the initial step in operation 170.

In step 500 match scores are developed for pairs of blobs. In one
embodiment, match scores are generated for all p-by-p pairs of generated blobs, with each
pair consisting of one generated blob from the target image and one generated blob from the
candidate image. The set of n significant blobs (n < p) to be placed in one-to-one
correspondence is then chosen on the basis of these match scores: if the best (largest) match
score matches blob i of the target image to blob j of the candidate image, then blob i from the
target is one of the n significant blobs, as is blob j from the candidate. Target blob i and
candidate blob j are then placed in one-to-one correspondence. This process is repeated until
n blobs from the target have been placed in one-to-one correspondence with n blobs from the
candidate. In another embodiment, the n significant blobs to be matched from each image
are chosen a priori to be the n largest blobs in the image and blob match scores are generated
for only the n-by-n pairs of these blobs. In this latter embodiment the matching of blobs at
step 510 is done on the basis of these n-by-n match scores: if the best match score matches
blob i of the target image to blob j of the candidate image, then target blob i and candidate
blob j are placed in one-to-one correspondence. This process is repeated until all n
significant blobs from the target have been placed in one-to-one correspondence with all n
significant blobs from the candidate.
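
The pairing procedure described above amounts to a greedy assignment: take the best remaining score, pair those two blobs, and exclude both from further consideration. The following sketch shows one way this could be written. It assumes that `scores[i][j]` holds the match score between target blob i and candidate blob j, and that a blob already paired is skipped on later rounds, which the text implies by "one-to-one" but does not spell out.

```python
def greedy_correspondence(scores, n):
    """Pair up to n target blobs with n candidate blobs, best scores first.

    scores[i][j] is the match score of target blob i against candidate
    blob j (larger is better). Returns a list of (i, j, score) tuples.
    """
    ranked = sorted(
        ((score, i, j)
         for i, row in enumerate(scores)
         for j, score in enumerate(row)),
        key=lambda item: -item[0],  # stable sort: ties keep (i, j) order
    )
    used_targets, used_candidates, pairs = set(), set(), []
    for score, i, j in ranked:
        if len(pairs) == n:
            break
        if i in used_targets or j in used_candidates:
            continue  # each blob may appear in at most one pair
        used_targets.add(i)
        used_candidates.add(j)
        pairs.append((i, j, score))
    return pairs
```

A greedy pass of this kind is fast but does not necessarily maximize the total of the paired scores; an optimal assignment method would be a different technique from the one the text describes.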

Figure 8 shows details of step 500, the process of generating match scores for
pairs of blobs. First, at step 600, for each given pair of blobs, similarity scores are generated
for each separate statistical component, that is, for each of the several measures which
collectively measure the area, location, color, shape and texture of a blob. At step 610, an
overall blob match score is generated from the individual component similarity scores. In
some embodiments the individual component similarity scores share the same bounds (from
0 to 1, or from 0 to 100), and the overall blob match score is a measure of the mean of the
individual component scores, either the arithmetic mean or the geometric mean or some other
measure with the property of a mean. In one embodiment, the latter mean similarity score is
weighted by the mean areas of the blobs being compared, so as to give a larger similarity
score to paired large blobs.
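
One plausible reading of steps 600 and 610 is a weighted arithmetic mean of the component scores, scaled by the mean area of the two blobs. The sketch below follows that reading. Using the user-set slider weights inside the mean is an assumption, as are the names `component_scores` and `weights`; the specification permits other means (e.g., geometric) as well.

```python
def blob_match_score(component_scores, weights, area_a, area_b):
    """Combine per-component similarity scores (each assumed to lie in 0..1)
    into one blob match score.

    component_scores: e.g. {"color": 0.9, "location": 0.5}
    weights: importance of each component, e.g. taken from user controls.
    """
    total_weight = sum(weights[name] for name in component_scores)
    if total_weight == 0:
        return 0.0
    weighted_mean = sum(
        weights[name] * score for name, score in component_scores.items()
    ) / total_weight
    # Weight by the mean area so that well-matched large blobs score higher.
    return weighted_mean * (area_a + area_b) / 2
```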

After n significant blobs from the target image have been placed in one-to-one
correspondence with n significant blobs from the candidate image at step 510 of Figure 7, the
overall image match score for the pair of images is generated at step 520 of Figure 7. The
overall image match score is generated as a sum or mean (or other increasing function) of the
n match scores for the n paired blobs in the one-to-one correspondence list of step 510. The
blob match scores used are the same ones that were generated at step 500 of Figure 7.

The set of all candidate-image match scores is computed by the operation of
step 170 of Figure 3, which comprises steps 500, 510 and 520 of Figure 7. When step 520 is
completed the resulting set of candidate-image match scores is passed to the final step 180 of
Figure 3.

In step 180, the system is programmed to display the candidate images in the
database identified as having the best matches with the target image (e.g., the top 20, or the
top 50, etc.) given the user's desired input requirements (i.e., parameter settings). If the
results are not to the user's liking, the user is able to modify the input parameters and search
again.
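
Steps 520 and 180 together reduce to scoring each candidate and keeping the best few. A minimal sketch follows, assuming the mean is used as the combining function and that candidate scores are held in a dictionary keyed by a hypothetical image identifier; `max_ids` mirrors the "Max Ids to Return" control.

```python
import heapq

def image_match_score(paired_blob_scores):
    """Step 520 (one option): the mean of the n paired blob match scores."""
    if not paired_blob_scores:
        return 0.0
    return sum(paired_blob_scores) / len(paired_blob_scores)

def best_matches(candidate_scores, max_ids=20):
    """Step 180: return the max_ids highest-scoring (image_id, score) pairs,
    best first."""
    return heapq.nlargest(
        max_ids, candidate_scores.items(), key=lambda item: item[1]
    )
```

If the target image is itself in the database, it would be expected to appear first in the returned list, since its blobs match themselves exactly.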

Figure 5 shows one set of displayed results, and Figure 4 shows the associated
image and match statistics for one match. In the case illustrated by Figure 5 the goal was to
match the lion cub image 300 with images in the database. The system returned the 20 best
matches. Because the target lion cub image 300 is itself a member of the database, the best
match is between the lion cub image and itself, as shown by the thumbnail lion cub image
305. The best non-trivial match is the second best overall match, between the target lion cub
image 300 and the great horned owl image 310.

In conclusion, the present invention provides a simple, efficient solution to the
problem of extracting images from a database "by example." While the above is a complete
description of the preferred embodiments of the invention, various alternatives, modifications
and equivalents may be used.

In one variant embodiment, the target image is not a photographic image but
an image painted by the user using computer graphic painting software. In this embodiment
the user who wants to find a lion cub in the image database first paints an image of the lion
cub and then looks for matches to the painted image. The search for matching candidate
images can be iterated as the painting progresses; a rough draft of the lion cub painting will
yield a first set of matches. As detail is added other sets of matches will be found.

It is often useful to look for image matches based on image textures, such as 
the textures in fabrics, in grass, in sand, or in the bark of trees. Texture matching techniques 
are often based on the spectral decomposition, such as can be obtained by a Fourier 
transformation, of areas of the image; texture matching can also be done by a process known 
as Wold decomposition, and described in "A new Wold ordering for image similarity," 
Rosalind W. Picard and Fang Liu, Proc. IEEE Conf. on Acoustics, Speech, and Signal Proc., 
Adelaide, Australia, April 1994, pp. 129-132, the contents of which are hereby incorporated 
by reference for all purposes. Texture-based comparisons are introduced into this invention 
in the following manner. Once the image has been decomposed into blobs, each such blob 
can be used as an index back into the original image. For example, in Figure 4, blob 202 is 
the body of the lion cub; blob 212 is the body of the owl. Areas of the original images 300 
and 310 of Figure 5 corresponding to each such blob are found, and texture measures are 
computed over the indicated areas of the original images. The resulting texture measures are 
added to the set of blob-characterizing statistics, and a texture similarity score is computed 
for each blob pair. Referring to Figure 5, a texture comparison from, on the one hand, the 
bodies of the owl 310, serval 315, second lion cub 320 and puma 325 to, on the other hand, 
the body of the target lion cub 300 will reveal the greater similarities of the fur-to-fur texture 
comparison between cat-cat pairs than the fur-to-feather comparison between the cat-owl 
pair. 
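
The use of a blob as an index back into the original image can be sketched as follows. This is only an illustration under stated assumptions: the texture measure used here (mean absolute difference between horizontally adjacent grey levels) is a deliberately simple stand-in and is neither the Fourier-based nor the Wold-based measure mentioned above, and the names `blob_texture` and `texture_similarity`, the list-of-lists image layout and the pixel values are all hypothetical.

```python
def blob_texture(gray, mask):
    """Mean absolute difference between horizontally adjacent pixels of a blob.

    gray: 2-D list of grey levels taken from the ORIGINAL (full-detail) image.
    mask: 2-D list of booleans of the same shape; True marks pixels that the
          blob covers, so the blob acts as an index into the original image.
    """
    total, count = 0.0, 0
    for row, mask_row in zip(gray, mask):
        for x in range(len(row) - 1):
            # Only compare neighbours that both lie inside the blob.
            if mask_row[x] and mask_row[x + 1]:
                total += abs(row[x + 1] - row[x])
                count += 1
    return total / count if count else 0.0


def texture_similarity(t1, t2):
    """Map two texture measures to a score in (0, 1]; 1 means identical."""
    return 1.0 / (1.0 + abs(t1 - t2))


# Two tiny illustrative patches, each fully covered by its blob.
mask = [[True] * 4, [True] * 4]
fine = [[10, 12, 10, 12], [12, 10, 12, 10]]    # low local contrast
coarse = [[10, 40, 10, 40], [40, 10, 40, 10]]  # high local contrast

t_fine = blob_texture(fine, mask)
t_coarse = blob_texture(coarse, mask)
same = texture_similarity(t_fine, t_fine)
different = texture_similarity(t_fine, t_coarse)
```

The value returned by `blob_texture` would be stored alongside the other blob-characterizing statistics, and `texture_similarity` would be evaluated once for each blob pair.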

Another variant embodiment modifies the image similarity score algorithm 
and then cycles through the image-comparison step 170 of Figure 3, culling the set of 
candidates to a smaller number on each pass. The very first comparisons between a target 
image and the set of candidate images may be a simple and fast culling operation using a 
relatively small set of image statistics over a small number of blobs per image, and basing 
image comparisons on relatively simple measures of differences between blobs. Such a first- 
pass culling operation can be used to reduce the number of candidate images from, for 
example, about 1,000,000 to about 100,000. A slightly more sophisticated set of tests is then 
used to reduce the set of candidate images to about 10,000, and so on, until a manageable 
number of candidate images, for example about 20, remain. 
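
The pass-by-pass culling described above can be sketched as a loop over progressively more expensive scoring functions. The sketch assumes each pass is given as a pair (scoring function, number of survivors to keep) and that higher scores are better; the function `cull`, the two scoring functions and the toy statistics (`area`, `color`) are hypothetical stand-ins for the blob statistics of the invention.

```python
def cull(candidates, target, passes):
    """Apply successive (score_fn, keep) passes, keeping the best `keep` each time.

    Early passes should use cheap scoring functions because they see the most
    candidates; later passes can afford more elaborate comparisons.
    """
    survivors = list(candidates)
    for score_fn, keep in passes:
        survivors.sort(key=lambda c: score_fn(target, c), reverse=True)
        survivors = survivors[:keep]
    return survivors


def coarse_score(target, cand):
    # Cheap first pass: compare a single statistic.
    return -abs(target["area"] - cand["area"])


def fine_score(target, cand):
    # Costlier later pass: compare area and (normalized) color.
    return -(abs(target["area"] - cand["area"])
             + abs(target["color"] - cand["color"]) / 255.0)


target = {"area": 0.50, "color": 100}
candidates = [
    {"id": "a", "area": 0.50, "color": 100},
    {"id": "b", "area": 0.52, "color": 180},
    {"id": "c", "area": 0.10, "color": 100},
    {"id": "d", "area": 0.48, "color": 105},
]
result = cull(candidates, target, [(coarse_score, 3), (fine_score, 2)])
result_ids = [c["id"] for c in result]
```

With a real database the `keep` values would follow the schedule in the text, for example 100,000, then 10,000, and so on down to about 20.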

Another variant of the invention bases the search for candidate images not on 
a single target image but on n = 2 or more target images. The candidate images are then the 
ones that match best to all n target images, as measured by the mean of all n matches, or the 
maximum or minimum of the n matches, or some compound of such measures. 
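
A minimal sketch of combining the n matches follows. It assumes a pairwise `match` function that returns a similarity score for one target and one candidate; here images are reduced to single numbers purely for illustration, and the names `combined_score`, `COMBINERS` and `match` are hypothetical.

```python
from statistics import mean

# The three combination rules named in the text.
COMBINERS = {"mean": mean, "max": max, "min": min}


def combined_score(candidate, targets, match, how="mean"):
    """Combine the n pairwise match scores of one candidate against n targets."""
    scores = [match(t, candidate) for t in targets]
    return COMBINERS[how](scores)


def match(target, candidate):
    # Toy pairwise score in (0, 1]; stands in for the image-similarity score.
    return 1.0 / (1.0 + abs(target - candidate))


targets = [1.0, 3.0]
mean_mid = combined_score(2.0, targets, match, "mean")  # equally close to both
max_edge = combined_score(1.0, targets, match, "max")   # identical to one target
min_edge = combined_score(1.0, targets, match, "min")   # far from the other
```

Using the minimum favors candidates that resemble every target, while the maximum favors candidates that resemble at least one of them.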

Other variants of the invention use blob-comparison measures and image- 
comparison measures other than similarity measures. Comparison can be based on 
difference measures just as well as on similarity measures, because difference measures can 
be constructed as inverses of similarity measures. Comparison can also be based on 
propinquity measures, since two sets of numbers can be said to be similar to the extent that 
they are close to each other. Comparison can also be based on distance measures just as well 
as on propinquity measures, since distance measures can be constructed as inverses of 
propinquity measures. 
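
The interchangeability of these measures can be illustrated with a short sketch. The text does not say which kind of inverse is intended, so the two conversions below (a complement for scores in the range 0 to 1, and a reciprocal form for non-negative distances) are assumptions, as are the function names and the score values.

```python
def difference_from_similarity(s):
    """Assumed inverse for a similarity score in [0, 1]: its complement."""
    return 1.0 - s


def propinquity_from_distance(d):
    """Assumed inverse for a distance d >= 0: maps 0 to 1 and large d toward 0."""
    return 1.0 / (1.0 + d)


# Illustrative similarity scores for three candidate images.
similarities = {"owl": 0.83, "serval": 0.79, "barn": 0.12}

# Ranking by descending similarity ...
by_similarity = sorted(similarities, key=similarities.get, reverse=True)
# ... gives the same order as ranking by ascending difference.
by_difference = sorted(
    similarities, key=lambda k: difference_from_similarity(similarities[k])
)
```

Because the conversion is monotonic, the ordering of candidates is unchanged, which is why either kind of measure can drive the comparison.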

In light of the various alternatives, modifications and equivalents to the 
present invention, the above description should not be taken as limiting the scope of the 
invention, which is defined by the appended claims. 

WHAT IS CLAIMED IS: 



1. A computer-implemented method of comparing images comprising the 
steps of: 
(a) providing two images to be compared; 
(b) computing image-characterizing statistics for both images; and 
(c) computing an overall comparison measure for the two images, said 
measure derived from the two sets of image-characterizing statistics, one set 
corresponding to each image. 

2. The method of claim 1 wherein the overall comparison measure is selected 
from the group consisting of a similarity measure, a difference measure, a propinquity 
measure and a distance measure. 

3. The method of claim 1 wherein the operation of computing image- 
characterizing statistics for an image includes the steps of: 
(i) decomposing the image into one or more blobs; and 
(ii) computing blob-characterizing statistics for each blob in the decomposed 
image. 

4. The method of claim 3 wherein the step of decomposing the image into one 
or more blobs includes the steps of: 
modifying the image to reduce detail; and 
identifying blobs of the reduced-detail image. 

5. The method of claim 4, wherein the step of modifying the image to reduce 
detail includes resizing the image. 

6. The method of claim 4, wherein the step of modifying the image to reduce 
detail includes the step of applying to the image one of a Gaussian blur filter, a median 
filter, and an edge-preserving blur filter. 

7. The method of claim 4 wherein the step of identifying blobs includes the 
step of reducing the number of distinct colors in the reduced-detail image. 

8. The method of claim 7 wherein the step of reducing the number of distinct 
colors in the reduced-detail image includes the step of applying to the reduced-detail 
image one of a median-cut algorithm, and an adaptive color seed-fill algorithm. 

9. The method of claim 4 wherein the steps of modifying the image to reduce 
detail and identifying blobs are sequentially and adaptively iterated until a halting 
criterion is satisfied. 

10. The method of claim 9 wherein the halting criterion is satisfied when at 
least a specified proportion of the image is covered by no more than a specified number 
of blobs. 

11. The method of claim 3 wherein the step of computing image- 
characterizing statistics from the decomposed image includes the step of computing at 
least one of an area measure, a location measure, a color measure, a shape measure and a 
texture measure for each of the one or more blobs. 

12. The method of claim 1 wherein one of the two images is provided as a 
candidate image and the other is provided as a target image, wherein step (c) of 
computing an overall comparison measure for the two images includes the steps of: 
(i) applying at least one comparison measure to pairs of sets of one or more 
image statistics, one set from the candidate image and one set from the target image; and 
thereafter 
(ii) combining the comparison measures calculated on the pairs of sets of one 
or more image statistics into one overall comparison measure between the candidate 
image and the target image. 

13. The method of claim 3, wherein one of the two images is provided as a 
candidate image and the other is provided as a target image, wherein step (c) of 
computing an overall comparison measure for the two images includes the steps of: 
(i) placing a number, n, of the blobs in the target image in one-to-one 
correspondence with n blobs in the candidate image, the one-to-one correspondence 
associating similar blobs from each image; 
(ii) for each blob pair, applying at least one comparison measure to pairs of 
sets of one or more blob statistics, one set from a blob of the candidate image and one set 
from a blob of the target image; 
(iii) combining the comparison measures calculated on the pairs of sets of one 
or more blob statistics into one overall blob-comparison measure between each blob of the 
candidate image and its corresponding blob of the target image; and thereafter 
(iv) combining the blob-comparison measures of the paired blobs into one 
overall image-comparison measure between the candidate image and the target image. 

14. The method of claim 12 or 13, wherein the at least one comparison 
measure is selected from the group consisting of a similarity measure, a difference 
measure, a propinquity measure and a distance measure. 

15. The method of claim 12 or 13 wherein the operation of combining the 
comparison measures calculated on the pairs of sets of one or more image statistics into 
one overall comparison measure between the candidate image and the target image is 
governed by a set of user-modifiable parameters affecting the relative importance of the 
pairwise comparisons in the computation of the overall comparison measure. 

16. A computer-implemented method of selecting from a computer database 
of candidate images one or more images that closely match a target image, the method 
comprising the steps of: 
(a) computing for each candidate image image-characterizing statistics and 
storing each candidate image's image-characterizing statistics in a computer memory; 
(b) computing image-characterizing statistics for the target image; 
(c) computing for each candidate image an overall comparison measure for the 
candidate image and the target image using the image-characterizing statistics of the 
candidate image and of the target image, and storing the overall comparison measure for 
each candidate image in the computer memory; and 
(d) using the set of overall comparison measures to identify one or more 
candidate images which closely match the target image. 

17. The method of claim 16 wherein the operation of computing image- 
characterizing statistics for an image includes the steps of: 
(i) decomposing the image into one or more blobs; and 
(ii) computing blob-characterizing statistics for each blob in the decomposed 
image. 

18. The method of claim 17 wherein the step of decomposing the image into 
one or more blobs includes the steps of: 
modifying the image to reduce detail; and 
identifying blobs of the reduced-detail image. 

19. The method of claim 16 wherein one of the two images is provided as a 
candidate image and the other is provided as a target image, and wherein step (c) includes 
the steps of: 
(i) applying, for each candidate image, at least one comparison measure to 
pairs of sets of one or more image statistics, one set from the candidate image and one set 
from the target image; and thereafter 
(ii) combining the comparison measures calculated on the pairs of sets of one 
or more image statistics into the overall comparison measure between the candidate 
image and the target image. 

20. The method of claim 17 wherein one of the two images is provided as a 
candidate image and the other is provided as a target image, and wherein step (c) includes 
the steps of: 
(i) for each candidate image, placing a number, n, of the blobs in the target 
image in one-to-one correspondence with n blobs in the candidate image, the one-to-one 
correspondence associating similar blobs from each image; 
(ii) for each blob pair, applying at least one comparison measure to pairs of 
sets of one or more blob statistics, one set from a blob of the candidate image and one set 
from a blob of the target image; 
(iii) combining the comparison measures calculated on the pairs of sets of one 
or more blob statistics into one overall blob-comparison measure between each blob of the 
candidate image and its corresponding blob of the target image; and thereafter 
(iv) combining the blob-comparison measures of the paired blobs into one 
overall image-comparison measure between the candidate image and the target image. 

21. The method of claim 16 further comprising the step of: 
displaying at least a portion of each of the one or more candidate images that 
closely match the target image. 

22. The method of claim 16 wherein step (c) is adaptively iterated so that on 
each iteration the number of candidate images that closely match the target image is 
reduced by culling out a number of poorly-matching images. 

23. The method of claim 22 wherein on each iteration computation of the 
comparison measures is varied so that image comparisons execute faster on early 
iterations when there are many candidate images than on later iterations when there are 
fewer candidate images. 

24. An image processing system comprising: 
a memory for storing a plurality of candidate images and image-characterizing 
statistics associated with each candidate image; 
input means for inputting a target image for comparison with the candidate 
images; 
a microprocessor coupled to the memory and the input means, wherein the 
microprocessor computes image-characterizing statistics for the target image, and 
wherein for each candidate image the microprocessor determines an image-comparison 
measure for the candidate image and the target image, wherein the image-comparison 
measure is computed as a function of the image-characterizing statistics of the target 
image and the image-characterizing statistics of the candidate image; and 
a display for displaying at least a portion of one or more of the candidate 
images having the best image-comparison measures. 

25. The image processing system of claim 24, wherein when each of the 
candidate images is stored to the memory, the microprocessor computes the associated 
image-characterizing statistics and stores the associated image-characterizing statistics to 
the memory. 

26. The image processing system of claim 24, wherein the microprocessor 
includes means for decomposing the image into one or more blobs, and wherein the 
microprocessor computes image-characterizing statistics from the decomposed image. 

27. The image processing system of claim 26, wherein the image- 
characterizing statistics for an image include at least one of an area measure, a location 
measure, a color measure, a shape measure and a texture measure for each of the one or 
more blobs. 

28. The image processing system of claim 24, wherein the microprocessor 
includes means for modifying an image to reduce detail so as to produce a reduced-detail 
image and means for identifying blobs in the reduced-detail image. 

29. The image processing system of claim 28, wherein the means for 
modifying the image to reduce detail includes means for resizing the image. 

30. The image processing system of claim 28, wherein the means for 
modifying the image to reduce detail includes means for applying one of a Gaussian blur 
filter, a median filter, and an edge-preserving blur filter. 

31. The image processing system of claim 28, wherein the microprocessor 
sequentially, adaptively and iteratively modifies the image to reduce detail and identify 
blobs, until at least a specified proportion of the image is covered by no more than a 
specified number of blobs. 

32. The image processing system of claim 24, wherein the microprocessor 
determines each image-comparison measure by applying comparison measures to pairs of 
sets of one or more image statistics, one set from the candidate image and one set from 
the target image, and combining the comparison measures into one overall image- 
comparison measure between the candidate image and the target image. 

33. The image processing system of claim 26, wherein the microprocessor 
determines each image-comparison measure by placing a number, n, of the blobs in the 
target image in one-to-one correspondence with n blobs in the candidate image, the one- 
to-one correspondence associating similar blobs from each image, and then for each blob 
pair, applies comparison measures to pairs of sets of one or more blob statistics, one set 
from a blob of the candidate image and one set from a blob of the target image, and then 
combines the comparison measures into one overall blob-comparison measure for each 
pair of blobs, and then combines the blob-comparison measures into one overall image- 
comparison measure between the candidate image and the target image. 

34. The image processing system of claim 32, wherein the microprocessor 
adaptively and iteratively determines the image-comparison measure between the 
candidate image and the target image, so that on each iteration the number of candidate 
images that closely match the target image is reduced by culling out a number of poorly- 
matching images. 

35. The image processing system of claim 34 wherein on each iteration the 
microprocessor varies the computation of the comparison measures so that image- 
comparisons execute faster on early iterations when there are many candidate images than 
on later iterations when there are fewer candidate images. 

36. An image processing system comprising: 
means for providing a target image and at least one candidate image; and 
a processor coupled to the providing means, wherein the processor 
computes image-characterizing statistics for the at least one candidate image and for the 
target image, and wherein the processor computes an overall comparison measure for the 
at least one candidate image and the target image, the measure being derived from the 
image-characterizing statistics of the target image and of the at least one candidate image. 

37. The image processing system of claim 36, wherein the processor includes 
a means for decomposing an image into one or more blobs, and a means for computing 
blob-characterizing statistics for each blob in the decomposed image, wherein the image- 
characterizing statistics for the target image and the at least one candidate image are 
computed using the corresponding blob-characterizing statistics. 

38. The image processing system of claim 37, wherein the means for 
decomposing an image includes means for modifying the image to reduce detail and 
means for identifying blobs in the reduced-detail image. 

39. The system of claim 38, wherein the means for modifying the image to 
reduce detail includes means for resizing the image. 

40. The system of claim 38, wherein the means for modifying the image to 
reduce detail includes means for applying to the image one of a Gaussian blur filter, a 
median filter, and an edge-preserving blur filter. 

41. The system of claim 38, wherein the means for identifying blobs includes 
means for reducing the number of distinct colors in the reduced-detail image. 

42. The system of claim 41, wherein the means for reducing the number of 
distinct colors in the reduced-detail image includes means for applying to the reduced- 
detail image one of a median-cut algorithm, and an adaptive color seed-fill algorithm. 

43. The system of claim 37, wherein the means for computing image- 
characterizing statistics from the decomposed image includes means for computing at 
least one of an area measure, a location measure, a color measure, a shape measure and a 
texture measure for each of the one or more blobs. 



44. The image processing system of claim 36 wherein the processor 
determines at least one comparison measure for pairs of sets of one or more image 
statistics, one set from the at least one candidate image and one set from the target image, 
and wherein the processor combines the comparison measures to determine the overall 
comparison measure. 

45. The image processing system of claim 37, wherein the processor includes: 
means for placing a number, n, of the blobs in the target image in one-to-one 
correspondence with n blobs in the at least one candidate image, the one-to-one 
correspondence associating similar blobs from each image; 
means for applying, for each blob pair, at least one comparison measure to 
pairs of sets of one or more blob statistics, one set from a blob of the at least one 
candidate image and one set from a blob of the target image; 
means for combining the comparison measures calculated on the pairs of sets 
of one or more blob statistics into one overall blob-comparison measure between each blob 
of the at least one candidate image and the corresponding blob of the target image; and 
means for combining the blob-comparison measures of the paired blobs into 
an overall image-comparison measure between the at least one candidate image and the 
target image. 



46. The image processing system of claim 44 or 45, wherein the at least one 
comparison measure is selected from the group consisting of a similarity measure, a 
difference measure, a propinquity measure and a distance measure. 